Add LoRA handling for image generation by atobiszei · Pull Request #4084 · openvinotoolkit/model_server

atobiszei · 2026-03-25T10:42:44Z

This includes:
-> LoRA pulling
-> multiple LoRA handling
-> NPU LoRA handling

mask field should only be accepted in image edit (inpainting) requests, not in text-to-image generation requests.

…ting

…s and demo review update

…ting

…n, fix docs and includes

- Add --source_loras CLI parameter for specifying LoRA adapters
  Format: alias=org/repo@file.safetensors (comma-separated, @file optional)
- Add LoRA adapter entries to image_gen_calculator.proto
- Parse and validate LoRA settings in image_generation_graph_cli_parser
- Export LoRA adapter entries in graph.pbtxt generation
- Load LoRA .safetensors via ov::genai::Adapter in pipelines.cpp
- Apply LoRA adapters at inference time based on model name routing
- Download LoRA repos via curl (resolve safetensors filename from HF API)
- Add LoRA alias routing in mediapipe factory
- Pass modelName through HttpPayload for LoRA alias matching
- Add 18 unit tests (CLI parsing, graph export, proto parsing, config)
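The `alias=org/repo@file.safetensors` format described above can be illustrated with a small parser. This is only a sketch under stated assumptions: `LoraSourceSketch` and `parseSourceLorasSketch` are hypothetical names, not the code in image_generation_graph_cli_parser, and the rejection rules (empty alias, empty repo, empty filename after `@`) are guesses at reasonable validation.

```cpp
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical illustration type; not the struct used in the PR.
struct LoraSourceSketch {
    std::string alias;
    std::string repo;                     // e.g. "org/repo"
    std::optional<std::string> filename;  // text after '@', if present
};

// Splits "a=org/repo@f.safetensors,b=org2/repo2" into entries.
// Returns std::nullopt when an entry lacks "alias=" or has an empty part.
std::optional<std::vector<LoraSourceSketch>> parseSourceLorasSketch(const std::string& input) {
    std::vector<LoraSourceSketch> result;
    size_t start = 0;
    while (start <= input.size()) {
        const size_t comma = input.find(',', start);
        const std::string entry = input.substr(
            start, comma == std::string::npos ? std::string::npos : comma - start);
        const size_t eq = entry.find('=');
        if (eq == std::string::npos || eq == 0 || eq + 1 >= entry.size()) {
            return std::nullopt;  // missing or empty alias / source
        }
        LoraSourceSketch item;
        item.alias = entry.substr(0, eq);
        const std::string source = entry.substr(eq + 1);
        const size_t at = source.find('@');
        if (at == std::string::npos) {
            item.repo = source;  // @file is optional
        } else {
            item.repo = source.substr(0, at);
            std::string file = source.substr(at + 1);
            if (file.empty()) {
                return std::nullopt;
            }
            item.filename = std::move(file);
        }
        if (item.repo.empty()) {
            return std::nullopt;
        }
        result.push_back(std::move(item));
        if (comma == std::string::npos) {
            break;
        }
        start = comma + 1;
    }
    return result;
}
```

When `@file` is omitted the sketch leaves `filename` empty, which corresponds to the case where the safetensors filename is resolved later from the HF API.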


- Support multiple LoRA source types: HF repo, direct URL, local file (alias= required)
- Extract shared curl_downloader utility from gguf_downloader
- Add composite LoRA aliases (e.g. blend=@pokemon:0.7+@anime:0.5)
- Support per-request lora_weights override in extra_body
- Local files referenced by absolute path in graph.pbtxt (no copy)
- HF LoRA: resolve .safetensors via API, download with curl
- clone() delegates to pullLoraAdapters() for all LoRA downloads
- resolveHfLoraFilenames() + pullLoraAdapters() split (private -> protected)
- Remove loraQueue: T2I/I2I always use clone(), only inpainting serialized
- PipelineSlotGuard (renamed from InpaintingQueueGuard)
- compileProperties built once in constructor (no default arg on reshapeAndCompile)
- CompositeLoraMap type alias replaces duplicate runtime structs
- Multiline composite formatting in graph.pbtxt
- Add RUN_UNSTABLE-gated pull tests for LoRA (HF resolve, download, full-flow)
- Add non-network unit tests (local file skip, non-imagegen no-op)
- Add SetUpServerForDownloadWithLoras test helper
- 59 tests pass (55 original + 4 new, 3 network-gated skip without RUN_UNSTABLE)
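To make the composite alias syntax concrete, here is a minimal sketch that parses the right-hand side of an entry such as `blend=@pokemon:0.7+@anime:0.5`. The names `CompositeComponentSketch` and `parseCompositeSketch` are hypothetical, and the default of 1.0 for a component without an explicit number is an assumption, not something the commit message states.

```cpp
#include <exception>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical component type for illustration only.
struct CompositeComponentSketch {
    std::string adapterAlias;
    float alpha = 1.0f;  // assumed default when ":<number>" is absent
};

// Parses the right-hand side of a composite, e.g. "@pokemon:0.7+@anime:0.5".
std::optional<std::vector<CompositeComponentSketch>> parseCompositeSketch(const std::string& rhs) {
    std::vector<CompositeComponentSketch> components;
    size_t start = 0;
    while (true) {
        const size_t plus = rhs.find('+', start);
        const std::string part = rhs.substr(
            start, plus == std::string::npos ? std::string::npos : plus - start);
        if (part.size() < 2 || part[0] != '@') {
            return std::nullopt;  // every component must look like "@alias"
        }
        CompositeComponentSketch component;
        const size_t colon = part.find(':');
        if (colon == std::string::npos) {
            component.adapterAlias = part.substr(1);
        } else {
            component.adapterAlias = part.substr(1, colon - 1);
            const std::string number = part.substr(colon + 1);
            try {
                size_t consumed = 0;
                component.alpha = std::stof(number, &consumed);
                if (consumed != number.size()) {
                    return std::nullopt;  // trailing garbage after the number
                }
            } catch (const std::exception&) {
                return std::nullopt;  // not a number at all
            }
        }
        if (component.adapterAlias.empty()) {
            return std::nullopt;
        }
        components.push_back(std::move(component));
        if (plus == std::string::npos) {
            break;
        }
        start = plus + 1;
    }
    return components;
}
```

Splitting on `+` first means an alpha written in exponent form with a plus sign would be rejected; that limitation is specific to this sketch.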

Resolved conflicts in:
- pipelines.hpp: keep PipelineSlotGuard name and LoRA fields, adopt main's blocking comment
- pipelines.cpp: keep LoRA adapter loading, compileProperties, adopt main's SPDLOG_ERROR
- http_image_gen_calculator.cc: keep LoRA logic, adopt main's const ref for inpainting tensors
- README.md: accept main's updated examples (model names, sizes, notes)

- Fix downloadFileWithCurl: use overload instead of const ref default parameter (was binding temporary to const std::string&)
- Add HF_TOKEN auth header to curl downloads for HF repos only (avoid leaking credentials to arbitrary DIRECT_URL servers)
- Rename authToken -> authTokenHF for clarity
- Skip RUN_UNSTABLE tests when HF_TOKEN is not set
- Provide explicit safetensors filename in download tests
- Restore missing 'curl -O' PNG download commands in image_generation README
- Update copilot-instructions rule 13: expanded dangling reference guidance
- Add missing #include <vector>, <utility> (cpplint)
- Fix comment spacing (cpplint)
- clang-format all changed files
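The credential rule above (send HF_TOKEN only to HF repos, never to arbitrary direct URLs) might look roughly like the following. `LoraSourceKindSketch` and `buildDownloadHeadersSketch` are hypothetical names used for illustration; the actual helper in curl_downloader.cpp may be shaped differently.

```cpp
#include <string>
#include <vector>

// Hypothetical source classification for illustration.
enum class LoraSourceKindSketch { HF_REPO, DIRECT_URL, LOCAL_FILE };

// Returns the extra HTTP headers to pass to the download call.
// The token is attached only when the source is an HF repo.
std::vector<std::string> buildDownloadHeadersSketch(LoraSourceKindSketch kind,
                                                    const std::string& authTokenHF) {
    std::vector<std::string> headers;
    if (kind == LoraSourceKindSketch::HF_REPO && !authTokenHF.empty()) {
        headers.push_back("Authorization: Bearer " + authTokenHF);
    }
    return headers;
}
```

Keeping the decision in one small function makes it easy to unit test that a direct URL never receives the token.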

MSVC /W4 treats variable shadowing as error (C4456). Inner loop variable 'it' shadowed outer pipelinesMap iterator.

- Detect Windows absolute paths (e.g. C:\path\to\file.safetensors) in addition to Unix paths (/ and ./ prefixes)
- Also detect .\ prefix for relative Windows paths
- Use find_last_of("/\\") instead of rfind('/') to extract filename from both Unix and Windows paths
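A minimal sketch of the detection rules listed above follows. The function names carry a `Sketch` suffix because they are not the real `isLocalFilePath` from stringutils; in particular, accepting a forward slash after the drive letter is an assumption.

```cpp
#include <cctype>
#include <string>

// Sketch of the listed rules; the real isLocalFilePath may differ in detail.
bool isLocalFilePathSketch(const std::string& s) {
    if (s.empty()) {
        return false;
    }
    if (s[0] == '/') {
        return true;  // Unix absolute path
    }
    if (s.rfind("./", 0) == 0 || s.rfind(".\\", 0) == 0) {
        return true;  // relative path with Unix or Windows separator
    }
    // Windows absolute path: drive letter, colon, then a separator.
    return s.size() >= 3 &&
           std::isalpha(static_cast<unsigned char>(s[0])) != 0 &&
           s[1] == ':' && (s[2] == '\\' || s[2] == '/');
}

// Extracts the filename using either separator, as the commit describes.
std::string fileNameFromPathSketch(const std::string& path) {
    const size_t pos = path.find_last_of("/\\");
    return pos == std::string::npos ? path : path.substr(pos + 1);
}
```

With `rfind('/')` alone, a path such as `C:\path\to\file.safetensors` has no match and the whole string would be treated as the filename, which is the case `find_last_of("/\\")` addresses.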

…tection

…ting_lora

# Conflicts:
#	src/mediapipe_internal/mediapipegraphdefinition.cpp
#	src/mediapipe_internal/mediapipegraphdefinition.hpp
#	src/pull_module/BUILD
#	src/server.cpp
#	src/test/graph_export_test.cpp

…GraphSidePackets

…ll_hf_models

- Fix demos/image_generation/README.md: use adapter alias as model name instead of base model + lora_weights for LoRA selection
- Fix guidance_scale: 0 -> 0.0 (OVMS rejects integer values)
- Fix docs/image_generation/reference.md: clarify model name routing as the adapter selection mechanism, document blending via composite adapters
- Fix docs/model_server_rest_api_image_generation.md: clarify lora_weights only overrides weights of already-active adapters
- Add docs/pull_hf_models.md: section on pulling image gen models with LoRA

…ting_lora

- Add static isValidLoraAlias() in CLI parser to sanitize LoRA alias names (alphanumeric, hyphens, underscores, dots only)
- Add ServableNameChecker collision detection in mediapipefactory when registering LoRA aliases (reject if alias shadows model/pipeline/graph name)
- Revert file_system_poll_wait_seconds default to 1 and sequence_cleaner_poll_wait_minutes default to 5
- Fix missing HfDownloaderPullHfModel test fixture after merge
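The alias character rule above (alphanumeric, hyphens, underscores, dots) can be sketched as follows. `isValidLoraAliasSketch` is a hypothetical stand-in for the PR's `isValidLoraAlias()`, and rejecting the empty string is an assumption.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Accepts only [A-Za-z0-9], '-', '_' and '.'; rejects empty aliases (assumed).
bool isValidLoraAliasSketch(const std::string& alias) {
    if (alias.empty()) {
        return false;
    }
    return std::all_of(alias.begin(), alias.end(), [](unsigned char c) {
        return std::isalnum(c) != 0 || c == '-' || c == '_' || c == '.';
    });
}
```

Because aliases are later used as routable model names, excluding separators such as `/`, `=`, `@`, `+` and `:` also keeps them unambiguous inside the `--source_loras` syntax.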

- NPU detected: set AdapterConfig::MODE_STATIC, skip runtime adapter switching
- Reject composite LoRA adapters on NPU (runtime switching unavailable)
- Warn when multiple LoRAs configured on NPU (all compiled permanently)
- Rename npuLoraFused -> npuLoraStaticMode for accuracy
- Add CLI LoRA parsing tests: alias validation, source types, composites
- Add pbtxt composite LoRA test in text2image_test
- Add local file path tests (Unix absolute, Windows behind ifdef)

… weight->alpha

- Add aliasesConflict() to ServableNameChecker interface for LoRA alias collision detection during graph validation (before factory lock)
- Implement aliasesConflictExcluding() in MediapipeFactory with shared_lock
- Validate aliases in mediapipegraphdefinition validate() after initializeNodes
- Simplify createDefinition alias loop (checks moved to validate phase)
- Update reloadDefinition to clear+re-register aliases on reload
- NPU LoRA calculator rejection: reject requests to main graph name when npuLoraStaticMode is active (direct client to use alias)
- Multi-LoRA NPU: require composite_lora_adapters definition (hard error)
- Multi-LoRA NPU calculator: only composite aliases accepted as targets
- Rename CompositeLoraComponent.weight -> alpha across proto/struct/CLI/export
- Rename npuLoraFused -> npuLoraStaticMode
- Register composite aliases for routing in image_gen_node_initializer
- Fix fmt formatting of resolution_t in imagegen_init.cpp log statements

…ting_lora # Conflicts: # docs/pull_hf_models.md

Add LoraLoadMode enum to proto and C++ to support different LoRA loading strategies:
- DYNAMIC (default): Runtime-switchable adapters
- STATIC: Static rank compilation
- FUSE: Permanently merge LoRA into base weights at compile time

FUSE adapters are compiled separately and excluded from runtime alias registration. DYNAMIC/STATIC adapters remain switchable at generate time.

Also fixes composite LoRA alias registration (skip FUSE adapters) and adds tests for the new functionality.
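As an illustration of the rule that FUSE adapters are excluded from runtime alias registration, consider the sketch below. The enum and struct mirror the names in the commit message with a `Sketch` suffix; they are hypothetical and do not reproduce the proto-generated types.

```cpp
#include <string>
#include <vector>

// Mirrors the three modes named in the commit message (illustrative only).
enum class LoraLoadModeSketch { DYNAMIC, STATIC, FUSE };

struct AdapterEntrySketch {
    std::string alias;
    LoraLoadModeSketch mode = LoraLoadModeSketch::DYNAMIC;  // DYNAMIC is the default
};

// Collects the aliases that may be registered for model-name routing.
// FUSE adapters are merged into the base weights, so they are skipped.
std::vector<std::string> routableAliasesSketch(const std::vector<AdapterEntrySketch>& adapters) {
    std::vector<std::string> aliases;
    for (const auto& adapter : adapters) {
        if (adapter.mode == LoraLoadModeSketch::FUSE) {
            continue;
        }
        aliases.push_back(adapter.alias);
    }
    return aliases;
}
```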

- Rename expectedImageGenNpuFuse → expectedImageGenNpuStatic in tests
- Fix NPU error message: 'fused' → 'static' in imagegen_init.cpp
- Fix CLI parser comment: clarify STATIC mode for NPU adapters
- All 44 LoRA tests passing

Copilot

Pull request overview

This PR extends OVMS image generation to support LoRA adapters end-to-end: CLI parsing (--source_loras) and graph export, LoRA download during HF pull, pipeline compilation with adapters, per-request adapter selection via model name routing (including composites), and alias-based routing/visibility in the MediaPipe factory.

Changes:

- Add LoRA adapter definitions (single + composite) to ImageGen graph proto, parsing, and graph export/CLI plumbing.
- Implement HF pull support for LoRA adapters (HF repo resolution + download; direct URL/local path support via CLI parsing).
- Add runtime routing support for LoRA aliases (MediaPipe alias registration/hide-base-model behavior) and request-level lora_weights overrides.

Reviewed changes

Copilot reviewed 44 out of 44 changed files in this pull request and generated 7 comments.


| File | Description |
| --- | --- |
| `src/test/text2image_test.cpp` | Adds pbtxt parsing tests for LoRA adapter fields. |
| `src/test/test_utils.hpp` | Declares new server test helpers for pull/start with LoRAs and REST port. |
| `src/test/test_utils.cpp` | Implements new server test helpers (threaded start with LoRA args). |
| `src/test/pull_hf_model_test.cpp` | Adds HF pull + LoRA tests and a large unstable pull/serve/generate integration test. |
| `src/test/ovmsconfig_test.cpp` | Adds config parsing tests for invalid/valid `--source_loras` combinations. |
| `src/test/graph_export_test.cpp` | Adds extensive graph export + CLI-to-settings tests for LoRA/composites/source types. |
| `src/stringutils.hpp` | Declares `isLocalFilePath`. |
| `src/stringutils.cpp` | Implements `isLocalFilePath` (Unix + Windows absolute + ./ .\). |
| `src/server.cpp` | Adjusts HF pull module casting to call non-const `clone()`. |
| `src/servable_name_checker.hpp` | Extends checker interface with alias-conflict detection. |
| `src/pull_module/hf_pull_model_module.hpp` | Makes `clone()` non-const; exposes LoRA resolve/pull helpers (protected). |
| `src/pull_module/hf_pull_model_module.cpp` | Adds HF API resolution for LoRA safetensors + downloads during `clone()`. |
| `src/pull_module/gguf_downloader.cpp` | Refactors curl download logic to shared curl downloader helper. |
| `src/pull_module/curl_downloader.hpp` | New shared curl download helper API. |
| `src/pull_module/curl_downloader.cpp` | New curl downloader implementation (progress + optional auth header). |
| `src/pull_module/BUILD` | Adds curl_downloader target; wires into pull module deps. |
| `src/modelmanager.hpp` | Implements new `aliasesConflict` API. |
| `src/modelmanager.cpp` | Adds alias conflict checks across models/pipelines/mediapipe definitions. |
| `src/mediapipe_internal/mediapipegraphdefinition.hpp` | Stores discovered LoRA aliases + hide-base-model flag. |
| `src/mediapipe_internal/mediapipegraphdefinition.cpp` | Validates LoRA alias conflicts; propagates LoRA routing metadata from node init. |
| `src/mediapipe_internal/mediapipefactory.hpp` | Adds alias→graph mapping and helper methods. |
| `src/mediapipe_internal/mediapipefactory.cpp` | Registers LoRA aliases for lookup/listing; hides base model when requested. |
| `src/mediapipe_internal/graph_side_packets.hpp` | Extends side packets with LoRA aliases and hide-base-model flag. |
| `src/image_gen/pipelines.hpp` | Adds adapter/composite storage; renames queue guard to `PipelineSlotGuard`. |
| `src/image_gen/pipelines.cpp` | Loads adapters and compiles pipelines with adapter properties; tracks NPU/static mode. |
| `src/image_gen/imagegenutils.cpp` | Allows `lora_weights` in accepted request fields. |
| `src/image_gen/imagegenpipelineargs.hpp` | Adds LoRA adapter + composite settings to pipeline args. |
| `src/image_gen/imagegen_init.cpp` | Parses LoRA adapter and composite entries from ImageGenCalculatorOptions. |
| `src/image_gen/image_gen_node_initializer.cpp` | Registers LoRA aliases into graph side packets; sets hide-base-model. |
| `src/image_gen/image_gen_calculator.proto` | Adds LoRA adapter and composite adapter proto fields + load mode enum. |
| `src/image_gen/http_image_gen_calculator.cc` | Applies LoRA selection per request (model routing) + optional `lora_weights`. |
| `src/http_rest_api_handler.cpp` | Persists resolved model name into `HttpPayload`. |
| `src/http_payload.hpp` | Adds `modelName` to payload for downstream routing logic. |
| `src/graph_export/image_generation_graph_cli_parser.cpp` | Adds `--source_loras` parsing (repo/url/local + composites + alpha). |
| `src/graph_export/graph_export.cpp` | Emits LoRA adapter entries (and composite entries) into generated graph.pbtxt. |
| `src/cli_parser.cpp` | Adds `--source_loras` CLI option and stores it in HF settings. |
| `src/capi_frontend/server_settings.hpp` | Adds LoRA settings types + `HFSettingsImpl::sourceLoras`. |
| `src/BUILD` | Adds cpp-httplib + image_generation_graph_cli_parser deps to tests. |
| `docs/pull_hf_models.md` | Documents `--source_loras` for pull mode. |
| `docs/model_server_rest_api_image_generation.md` | Documents `lora_weights` request field. |
| `docs/image_generation/reference.md` | Adds LoRA adapter usage docs (routing, composites, overrides). |
| `demos/image_generation/README.md` | Adds Multi-LoRA serving examples; improves inpainting/outpainting notes. |
| `demos/common/export_models/export_model.py` | Adds `--source_loras` support for exporting image generation configs (with LoRA download). |
| `.github/copilot-instructions.md` | Updates guidance about avoiding dangling refs in default args. |

Comments suppressed due to low confidence (1)

src/test/test_utils.cpp:850

This overload builds argv using port.c_str() where port is a local variable, and argv itself is a stack array. The server thread may outlive this function, so the argument pointers can dangle (use-after-scope). Please ensure argument storage outlives the thread (heap-owned vectors captured by value).
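One way to address the lifetime concern raised in this comment is to let a single object own the argument strings and build `argv` from them on demand. The class below is a hypothetical sketch, not the helper in test_utils.cpp; it assumes C++17, where `std::string::data()` returns a mutable pointer.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical owner of command-line arguments for a server started in a thread.
class OwnedArgsSketch {
public:
    explicit OwnedArgsSketch(std::vector<std::string> args) :
        storage(std::move(args)) {}

    int argc() const { return static_cast<int>(storage.size()); }

    // Builds the pointer array on demand, so the pointers always refer to
    // strings owned by this object rather than to caller locals.
    std::vector<char*> argv() {
        std::vector<char*> out;
        out.reserve(storage.size() + 1);
        for (auto& s : storage) {
            out.push_back(s.data());
        }
        out.push_back(nullptr);  // conventional argv terminator
        return out;
    }

private:
    std::vector<std::string> storage;
};
```

A thread body could take such an object by value (moved into the lambda capture) and call `argv()` inside the thread, so that nothing points into the stack frame of the function that started it.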

rasapala · 2026-05-18T13:07:24Z

+    // All adapters were registered at compile time (alpha=1.0 each).
+    // At generate time we must explicitly set the adapter config:
+    //   - If modelName matches a composite alias: activate all component adapters with their weights.
+    //   - If modelName matches a single adapter alias: activate that adapter.
+    //   - Otherwise: disable all adapters (alpha=0) so the base model runs clean.
+    // lora_weights from request body can override default weights.


It is already updated.
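The selection rules in the quoted comment can be sketched as a pure function that maps a request's model name to an alpha per registered adapter. All names here are hypothetical, and the rule that `lora_weights` only changes adapters that are already active follows the documentation fix mentioned in the commit messages rather than the calculator code itself.

```cpp
#include <map>
#include <string>
#include <vector>

// composite alias -> (component adapter alias -> alpha); illustrative type only.
using CompositeMapSketch = std::map<std::string, std::map<std::string, float>>;

// Returns the alpha to apply to each registered adapter for one request.
std::map<std::string, float> selectAdapterAlphasSketch(
    const std::vector<std::string>& adapters,
    const CompositeMapSketch& composites,
    const std::string& modelName,
    const std::map<std::string, float>& loraWeightsOverride) {
    std::map<std::string, float> alphas;
    for (const auto& adapter : adapters) {
        alphas[adapter] = 0.0f;  // default: base model runs with adapters disabled
    }
    const auto composite = composites.find(modelName);
    if (composite != composites.end()) {
        // Composite alias: activate every component with its configured alpha.
        for (const auto& [alias, alpha] : composite->second) {
            const auto it = alphas.find(alias);
            if (it != alphas.end()) {
                it->second = alpha;
            }
        }
    } else if (alphas.count(modelName) != 0) {
        alphas[modelName] = 1.0f;  // single adapter alias
    }
    // Request-level overrides touch only adapters that are already active.
    for (const auto& [alias, alpha] : loraWeightsOverride) {
        const auto it = alphas.find(alias);
        if (it != alphas.end() && it->second != 0.0f) {
            it->second = alpha;
        }
    }
    return alphas;
}
```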

- Add validateLoraAdapterConfig() for alpha consistency between individual and composite levels (error if both non-default)
- NPU validation: composites required for multi-LoRA, all adapters must be referenced, consistent alpha across composites
- Fix Windows drive letter colon in CLI parser (lastColon > 1)
- Document LoRA adapter modes (DYNAMIC/STATIC/FUSE) in reference.md
- Document --source_loras format, alpha, source type detection
- Add tests: alpha at individual/composite/both levels, explicit 1.0, Windows absolute path with alpha
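The `lastColon > 1` fix can be illustrated with a sketch that splits an optional `:alpha` suffix off a source string without mistaking the colon of a Windows drive letter for the separator. `splitAlphaSuffixSketch` is a hypothetical name, and the extra requirement that the tail parse fully as a number is an assumption added so that URLs are left intact.

```cpp
#include <exception>
#include <string>
#include <utility>

// Splits "source:alpha" into {source, alpha}; returns {input, 1.0} when no
// valid alpha suffix is present. A colon at index 0 or 1 is never treated as
// the separator, which protects drive letters such as "C:".
std::pair<std::string, float> splitAlphaSuffixSketch(const std::string& source) {
    const size_t lastColon = source.rfind(':');
    if (lastColon != std::string::npos && lastColon > 1 && lastColon + 1 < source.size()) {
        const std::string tail = source.substr(lastColon + 1);
        try {
            size_t consumed = 0;
            const float alpha = std::stof(tail, &consumed);
            if (consumed == tail.size()) {
                return {source.substr(0, lastColon), alpha};
            }
        } catch (const std::exception&) {
            // Tail is not a number; fall through and keep the input whole.
        }
    }
    return {source, 1.0f};
}
```

For example, `C:\loras\a.safetensors` is returned unchanged with alpha 1.0, while `C:\loras\a.safetensors:0.5` yields the path and alpha 0.5.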

atobiszei · 2026-05-18T11:44:31Z

+// See the License for the specific language governing permissions and
+// limitations under the License.
+//*****************************************************************************
+#include "curl_downloader.hpp"


Extracted from GGUF_downloader

These are almost the same. Can we make a base curlDownloader class?

The common functionality is already extracted into the downloadFileWithCurl() free function in curl_downloader.cpp, which both GGUF and LoRA paths call. The orchestration around it differs: GGUF handles multi-part file resolution and overwrite-remove logic; LoRA handles source-type dispatch (HF repo / URL / local),

atobiszei · 2026-05-18T12:56:09Z

+**FUSE mode:**
+- The adapter is merged into base weights during model compilation using `MODE_FUSE`.
+- It is always active — the base model without the adapter is **not accessible**.
+- Does not appear in the list of routable adapters and cannot be selected or deselected via the `model` field.


Only the adapter is available.

rasapala · 2026-05-18T13:22:17Z

    SttServableMap sttServableMap;
    TtsServableMap ttsServableMap;
+    std::vector<std::string> loraAliases;
+    bool hideBaseModel = false;


Please describe what it is used for.

rasapala · 2026-05-18T13:31:08Z

    ASSERT_EQ(std::get<Status>(res), ovms::StatusCode::PLUGIN_CONFIG_CONFLICTING_PARAMETERS);
 }
+
+// ===================== LoRA Graph Export Tests =====================


I suggest adding a new file for these LoRA-specific tests. I do not see that we reuse much from this file?

removeVersionString().

I see it is already defined in 2 places (both the graph_export and hf_pull tests), so I will extract it and share it across all three files.

rasapala · 2026-05-18T13:34:16Z

    uint16_t n = 3;
    testResponseFromOvTensor(n);
 }
+// ===================== LoRA Proto Parsing Tests =====================


I suggest adding a new file with these tests.

In this case I am not convinced: LoRAs are mainly for image generation, and here we test basically the same thing (proto parsing).

atobiszei · 2026-05-18T13:38:55Z

+    std::vector<std::string> loraAliases_;
+    bool hideBaseModel_ = false;


Drop the trailing "_".

…x docs tab-set

atobiszei added 24 commits March 2, 2026 16:28

WIP

3ddabdc

Inpainting/outpainting CPU

b5ed7d8

Update dockerignore

c2ba9d8

Demo

af51476

Change mask

5f89fc2

Fix

102f8af

Fix: remove mask from accepted fields in text2image request options

330374f

mask field should only be accepted in image edit (inpainting) requests, not in text-to-image generation requests.

Merge remote-tracking branch 'origin/main' into atobisze_image_inpain…

0a65e68

…ting

Fix concurrent request inpainting issue, propagate quantization param…

eac3932

…s and demo review update

Merge remote-tracking branch 'origin/main' into atobisze_image_inpain…

f4be12a

…ting

Add tests & review fix

a2476ee

Address PR review: blocking inpainting guard, string_view optimizatio…

15dd08a

…n, fix docs and includes

Minor comment fix

7e392c6

Add download commands for inpainting/outpainting demo images

9da0860

Fix Windows build: rename shadowed variable 'it' to 'member'

4fe7c75

MSVC /W4 treats variable shadowing as error (C4456). Inner loop variable 'it' shadowed outer pipelinesMap iterator.

Extract isLocalFilePath() into stringutils for cross-platform path de…

519ea61

…tection

Use generic_string() for cross-platform path separators in imagegen_init

c288736

Merge remote-tracking branch 'origin/main' into atobisze_image_inpain…

b1d2ca6

…ting_lora

# Conflicts:
#	src/mediapipe_internal/mediapipegraphdefinition.cpp
#	src/mediapipe_internal/mediapipegraphdefinition.hpp
#	src/pull_module/BUILD
#	src/server.cpp
#	src/test/graph_export_test.cpp

Fix build after merge: filesystem path, const cast, LoRA aliases via …

453f712

…GraphSidePackets

dtrawins added this to the 2026.2_rc milestone May 8, 2026

atobiszei added 5 commits May 13, 2026 10:29

Merge remote-tracking branch 'origin/main' into atobisze_image_inpain…

8e36a01

…ting_lora

atobiszei added 3 commits May 14, 2026 16:43

Merge remote-tracking branch 'origin/main' into atobisze_image_inpain…

d76da73

…ting_lora # Conflicts: # docs/pull_hf_models.md

atobiszei marked this pull request as ready for review May 18, 2026 07:34

Copilot AI review requested due to automatic review settings May 18, 2026 07:34

Copilot started reviewing on behalf of atobiszei May 18, 2026 07:35

Copilot AI reviewed May 18, 2026


atobiszei changed the title ~~WIP LORA image generation~~ Add LoRA handling for image generation May 18, 2026

Self-review

a705a22

atobiszei commented May 18, 2026


self-review part 2

34ce570

atobiszei commented May 18, 2026


Comment thread demos/common/export_models/export_model.py Outdated

atobiszei commented May 18, 2026


rasapala reviewed May 18, 2026


rasapala requested changes May 18, 2026


atobiszei commented May 18, 2026


Split LoRA tests, share removeVersionString, rename hideBaseModel, fi…

e38114a

…x docs tab-set


4 participants
